Evaluating Cross-Language Explicit Semantic Analysis and Cross Querying at TEL@CLEF 2009

Authors

Maik Anderka

Nedim Lipka

Benno Stein

Abstract

This paper describes our participation in the TEL@CLEF task of the CLEF 2009 adhoc track. The task is to retrieve items from various multilingual collections of library catalog records, which are relevant to a user’s query. Two different strategies are employed: (i) the Cross-Language Explicit Semantic Analysis, CL-ESA, where the library catalog records and the queries are represented in a multilingual concept space that is spanned by aligned Wikipedia articles, and, (ii) a Cross Querying approach, where a query is translated into all target languages using Google Translate and where the obtained rankings are combined. The evaluation shows that both strategies outperform the monolingual baseline and achieve comparable results. Furthermore, inspired by the Generalized Vector Space Model we present a formal definition and an alternative interpretation of the CL-ESA model. This interpretation is interesting for real-world retrieval applications since it reveals how the computational effort for CL-ESA can be shifted from the query phase to a preprocessing phase.
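The CL-ESA idea summarized in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the two tiny "concept articles" per language stand in for aligned Wikipedia articles, plain term frequencies with cosine similarity stand in for whatever weighting the paper uses, and the names `CONCEPTS`, `concept_vector`, and `ranking` are hypothetical. The sketch also shows the split the abstract points to: record vectors are computed once in a preprocessing phase, so the query phase only maps the query into the concept space and compares vectors.

```python
import math
from collections import Counter

# Hypothetical toy stand-in for aligned Wikipedia articles:
# position i in each language's list describes the same concept.
CONCEPTS = {
    "en": ["music composer symphony orchestra", "river water harbor ship"],
    "de": ["musik komponist sinfonie orchester", "fluss wasser hafen schiff"],
}


def tf(text):
    """Bag-of-words term frequencies (assumed weighting, for illustration)."""
    return Counter(text.lower().split())


def cosine_sparse(u, v):
    """Cosine similarity between two term-frequency mappings."""
    dot = sum(w * v.get(t, 0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


def cosine_dense(a, b):
    """Cosine similarity between two equal-length concept vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def concept_vector(text, lang):
    """Represent a text by its similarity to each concept article of its language."""
    doc = tf(text)
    return [cosine_sparse(doc, tf(article)) for article in CONCEPTS[lang]]


# Preprocessing phase: map every catalog record into the concept space once.
records = ["sinfonie und orchester", "schiff auf dem fluss"]
record_vectors = [concept_vector(r, "de") for r in records]

# Query phase: map the query through its own language's articles, then
# compare language-independent concept vectors.
query_vector = concept_vector("symphony orchestra", "en")
ranking = sorted(
    ((i, cosine_dense(query_vector, rv)) for i, rv in enumerate(record_vectors)),
    key=lambda pair: pair[1],
    reverse=True,
)

for index, score in ranking:
    print(f"{score:.3f}  {records[index]}")
```

Because the English query and the German records never share a vocabulary, the match comes entirely from the alignment of the concept articles, which is the property that makes the representation cross-lingual.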

Related resources

Cross-lingual Information Retrieval based on Multiple Indexes

In this paper we present the technical details of the retrieval system with which we participated in the CLEF09 Ad-hoc TEL task. We present a retrieval approach based on multiple indexes for different languages, combined with a concept-based retrieval approach based on Explicit Semantic Analysis. In order to create the language-specific indices for each language, a language detection app...

Cross-language Information Retrieval with Explicit Semantic Analysis

We participated in the monolingual and bilingual CLEF Ad-Hoc Retrieval Tasks, using a novel extension of the by now well-known Explicit Semantic Analysis (ESA) approach. We call this extension Cross-Language Explicit Semantic Analysis (CL-ESA), as it allows ESA to be applied in a cross-lingual information retrieval setting. In essence, ESA represents documents as vectors in the space of Wikipedi...

An Evaluation of Greek-English Cross Language Retrieval within the CLEF Ad-Hoc Bilingual Task

This article describes an experimental investigation into the use of web resources for a common Natural Language Processing (NLP) problem, that of Word Sense Disambiguation (WSD). In particular, we apply our disambiguation experiments with statistical query translation to a Greek-English cross-language retrieval system using Google’s n-grams. Results from our participation in the Ad-Hoc TEL trac...

Combining Wikipedia-Based Concept Models for Cross-Language Retrieval

As a low-cost resource that is up-to-date, Wikipedia has recently gained attention as a means of providing cross-language bridging for information retrieval. Contrary to a previous study, we show that standard Latent Dirichlet Allocation (LDA) can extract cross-language information that is valuable for IR by simply normalizing the training data. Furthermore, we show that LDA and Explicit Semanti...

CACAO Project at the TEL@CLEF 2009 Task

This paper presents the participation of the CACAO prototype in the TEL@CLEF 2009 task, an evaluation track focusing on multilingual document retrieval over a collection of library catalogues. CACAO (Cross-language Access to Catalogues And On-line libraries) is an EU project devoted to enabling cross-language access to the contents of a federation of digital libraries with a set of software too...

Publication date: 2009
